THE PROBLEM


Investigate the probability density distribution of the ampli- 
tude of ocean ambient noise and ship noise; determine any 
differences in the distributions which might lead to the 
identification of ship noise masked by a high background- 
noise level. Also, determine, by standard statistical 
methods, whether the distributions are gaussian or non- 
gaussian. 


RESULTS 


1. Ambient ocean noise was found to have a gaussian 
distribution of amplitudes (in the sense that the moments of 
the distribution satisfied specific tests) only when the am- 
bient noise was relatively clean, i.e., the noise did not 
contain high-level ship noise, biological noise, ice noise or 
any of the other extraneous noises discussed in the text. 


2. The group of ship-noise samples recorded at 
close range contained a large number of samples that had a 
non-gaussian distribution. However the other types of ex- 
traneous noises were found to cause the same kind of de- 
viation from a gaussian distribution, so that it was not 
possible by these tests to distinguish between a sample 
with ship noise and a sample with the other types of ex-
traneous noises (such as biological and ice noise).


RECOMMENDATIONS


1. Use the method of moments described here if 
better accuracy than that given by overlays is desired to 
estimate the moments and to determine whether a sample 
is gaussian or non-gaussian.


2. In the probability density analysis of a noise
sample, use a range of amplitudes covering at least ±4
standard deviations; otherwise large errors in the estimates
of the moments will frequently result.


3. In future applications of the PDA, have the output 
of the PDA in a digital form rather than a continuous curve 
so that the data will be available in a form more suitable 
for the calculation of the moments of the distribution. 


INTRODUCTION


The study reported here was undertaken to investigate 
the probability density distribution of the amplitudes of 
ocean ambient noise and ship noise with respect to various 
bandwidths in several frequency ranges. The question to 
be answered was whether ambient noise, without any ship 
noise or biological noises, can be considered gaussian, and 
whether the presence of ship noise significantly changes the 
probability density distributions. A secondary objective
was to investigate methods of data reduction of the probability
density curves obtained with the B & K Probability Density
Analyzer, using standard statistical tests.


The probability density function, as treated throughout
this report, may be defined as follows.

$$ p(x) = \lim_{\Delta x \to 0} \; \lim_{N \to \infty} \frac{n_i}{N \, \Delta x} \qquad (1) $$

where x is a random variable, with its range of values
divided into a large number of continuous intervals Δx.
Measure its instantaneous value a great number of times N.
Let n_i be the number of measured values of x in the ith
interval (Δx_i).


The above equation can be rewritten as

$$ p(x) = \lim_{\Delta x \to 0} \; \lim_{T \to \infty} \frac{1}{\Delta x} \, \frac{\sum_i \Delta t_i}{T} \qquad (2) $$

where Δt_i is the amount of time the signal spends in the
interval Δx, and T is the total time of the sample. Equation
2 indicates more clearly how the B & K PDA measures the
probability density function. A more detailed explanation
can be found in reference 1. (See list of references at end
of report.)
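The counting definition above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `density_from_counts`, the simulated gaussian input, and the choice of 60 intervals between -3 and +3 are hypothetical and not taken from the report. For a signal sampled at equal time steps, the fraction of samples in an interval equals the fraction of time spent there, so the same estimate also stands in for the time-in-window form.

```python
import math
import random

def density_from_counts(samples, x_lo, x_hi, n_bins):
    """Estimate p(x) as n_i / (N * dx) over equal-width intervals."""
    dx = (x_hi - x_lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        i = math.floor((x - x_lo) / dx)   # index of the interval containing x
        if 0 <= i < n_bins:               # values outside the range are ignored
            counts[i] += 1
    n_total = len(samples)                # N counts every measurement, in range or not
    centers = [x_lo + (i + 0.5) * dx for i in range(n_bins)]
    density = [c / (n_total * dx) for c in counts]
    return centers, density

# Hypothetical input: simulated gaussian noise with unit rms value.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(200_000)]
centers, density = density_from_counts(noise, -3.0, 3.0, 60)
```

The interval width of 0.1 standard deviation was chosen here to mirror the window width quoted later for the PDA.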
The function

$$ p(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma^2} \right] $$

where x̄ is the mean and σ is the standard deviation, is
illustrated in figure 1.


TEST PROGRAM


Instrumentation 


The equipment used for the investigation is described 
below and illustrated in figure 2. 


An Ampex Model 350 was used as the record and play- 
back recorder. This model has a good low-frequency 
response to below 20 c/s. 


The filters following the recorder were an Allison 
Laboratories Model 2-A (used mainly as a low-pass filter) 
and a B & K Band Pass Filter Set, Type 1611. 


A McIntosh amplifier, Model MC30, was used to 
raise the signal level to 1 volt rms or greater. 


The B & K Probability Density Analyzer, Model 160 
(to be referred to as the PDA) was the main piece of equip- 
ment and has been primarily designed to obtain the prob- 
ability density curves of disturbances that are essentially 
random in character. A brief description of the PDA and 
its use in this investigation is given in Appendix A. A com- 
plete and detailed description of the PDA can be obtained 
from the instruction manual.


[Block diagram: Ampex 350 playback unit -> Allison Labs filter, Model 2-A, or B & K filter, Type 1611 -> McIntosh amplifier, Model MC30 -> B & K Probability Density Analyzer, Model 160 -> Variplotter XY recorder and counter]


Figure 2. Block diagram of Probability Density Analysis system.


An XY recorder by Electronic Associates, Inc., was 
used to record the analog X and Y outputs of the PDA.


A cathode ray oscilloscope monitored the signal out- 
put of the filter. 


The counter used responded to frequencies of at least 
10 Mc/s for use with the PDA. The counter can be used in 
place of an XY recorder and, in fact, is essential if 
measurements are to be made at low probability densities. 


Research Techniques 


Data which had been recorded for previous ambient- 
noise studies were available for this study. These samples 
had been recorded on 10-inch reels of 1/4-inch tape, at 32
inches per second, and were from three locations. Two 
groups had been made in shallow water -- one, about 2 miles 
from the western side of an island off the coast of Southern 
California, and the other in the Bering Straits. These con- 
sisted of short ambient-noise samples recorded at regular 
intervals throughout the day, so that one reel covered data 
for one day. The third location represented was in deep 
water in the North Pacific between Hawaii and Alaska; most 
of these samples were of longer duration than the other two 
groups, but covered only a few days. 


Samples of ship noise were desired, so that their 
probability density curves might be compared with those of
"clean" ambient noise. Recordings were made of ships
entering San Diego Harbor, with the sampling made at ap- 
proximately the closest point of approach. These included 
Navy surface ships, submarines (surfaced), and commercial 
ships. 


Several factors were considered in choosing the data 
samples to be used in this study. 


1. "Clean" ambient noise was used to determine
whether the distributions of the amplitudes were gaussian
or near-gaussian according to certain tests which will be
discussed later. Ambient noise was judged to be "clean"
when it was free from ship noise, biological noises, or any
man-made sounds when the sample was monitored. A band-
pass filter and oscilloscope were used to determine whether
60-c/s hum or any other single frequency components were
present in the noise sample.


2. All noise samples should be stationary for their 
entire length. When the sample is ambient ocean noise, 
this condition will not in general be true. For a noise 
sample to be stationary it is necessary for the sample 
parameters, the means and the variances, to remain un- 
changed as measured from samples taken at different times. 
It is possible that no significant difference in the sample 
parameters will be found if the time between samples is 
short enough. In a previous study* it was concluded that 
ocean noise is a slowly varying, not a stationary, process.
This conclusion was based on a comparison of samples that 
were 3 or more minutes apart. However, no significant 
difference was found among the values of some other samples 
which were only 3 minutes or less apart. Thus it appears 
reasonable to assume that ocean noise is stationary during 
a short interval of time (less than 3 minutes). 


3. The PDA requires a noise sample of about 30 
minutes duration for a complete automatic analysis of the 
amplitudes from -3.00 to +3.00 standard deviations.


The need for a long noise sample that is stationary 
can be satisfied by recording a short noise sample on mag- 
netic tape and then making a loop of the tape. A loop length 
was selected according to the following requirements. 


a. The loop should be short enough so that the noise 
could be considered stationary and so that the entire loop 
could be analyzed for each amplitude interval. The PDA 
(in the particular position used) requires 30 seconds to 
sweep a range of amplitudes equal to the window width, 
which is 0.1 times the rms value of the input signal. A 
sample length of 7 seconds met all the above requirements
and this gives a loop size of 52.5 inches, which was
conveniently handled.


b. The recorded noise on the loop should be con- 
tinuous, i.e., there should be no blank intervals on the 
loop, since a blank interval would change the average rms 
value of the recorded noise. 


A typical analysis procedure was as follows. A portion 
of data was selected for analysis from the recorded data 
available. The noise was re-recorded on a loop. The loop 
was played back at 7½ ips and the analysis proceeded as
indicated by the diagram in figure 2. The filter was set to 
the desired bandwidth, and the noise was amplified to 1 volt 
rms or greater. The PDA was carefully calibrated and 
adjusted just before each analysis. Its input level of noise 
was adjusted to 1 volt rms by its potentiometer, thus 
normalizing its output. 


Probability density of the amplitudes was recorded on
the Y scale of the XY recorder and the amplitude around
which the probability density was measured was on the X
scale. Scale factors were selected to give a deflection of
4 inches on the Y scale for a probability density range of 0
to 0.4, and a deflection of 1 inch per standard deviation of
amplitude on the X scale. The automatic sweep of the
PDA was set to start at X = -3.00 standard deviations, and would
automatically sweep through to X = +3.00 standard deviations,
based on a 1-volt rms input. Total running time was about
30 minutes. This procedure was repeated for each band-
width on every loop analyzed.
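The timing figures quoted in this section are mutually consistent, as the following arithmetic sketch in Python suggests. The constant names are hypothetical; the tape speed of 7.5 ips is an assumption inferred from the 52.5-inch loop and the 7-second sample length.

```python
WINDOW_WIDTH_SD = 0.1       # window width, in standard deviations (0.1 x rms)
SECONDS_PER_WINDOW = 30.0   # time to sweep one window width
SWEEP_RANGE_SD = 6.0        # sweep from -3.00 to +3.00 standard deviations
LOOP_SECONDS = 7.0          # length of the recorded sample on the loop
TAPE_SPEED_IPS = 7.5        # assumed playback speed, inches per second

# Number of window widths in the full sweep, and the resulting running time.
n_windows = SWEEP_RANGE_SD / WINDOW_WIDTH_SD
total_minutes = n_windows * SECONDS_PER_WINDOW / 60.0

# Physical loop length, and how often the loop repeats per window width.
loop_inches = LOOP_SECONDS * TAPE_SPEED_IPS
loop_passes_per_window = SECONDS_PER_WINDOW / LOOP_SECONDS

print(round(total_minutes, 1), "minutes,", round(loop_inches, 1), "inches")
```

Under these assumptions the loop passes the playback head a little more than four times while the analyzer sweeps one window width, which is what allows the entire loop to be analyzed for each amplitude interval.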


Table 1 lists the number of samples analyzed from 
each location, the total number of probability density 
curves obtained from the samples, and the filter used to 
analyze these curves. When the Allison Laboratories filter 
was used, the system cutoff frequency at the low end was 
about 20 c/s and the upper cutoff frequency was determined 
by the filter which was set at 2500, 1500, 1200, 600, 300, 
or 150 c/s. The B & K filter was used in both the octave
and third-octave positions for center band frequencies of
100, 200, 400, 800, and 1600 c/s.


TABLE 1. NOISE SAMPLES SELECTED FOR ANALYSIS, BY LOCATION.
(FOR THE BANDWIDTHS USED, SEE ABOVE)


LOCATION | NUMBER OF NOISE SAMPLES | NUMBER OF PD CURVES OBTAINED | FILTER USED FOR ANALYSIS OF DATA


SOUTHERN
CALIFORNIA


8 SAMPLES WITH ALLISON LABS FILTER;
1 SAMPLE WITH ALLISON LABS AND B & K


BERING STRAITS 


ALLISON LABS 
NORTH PACIFIC 


SAN DIEGO 
(SHIP NOISE 
IN HARBOR) 


ALLISON LABS 


Actual probability density curves of ambient noise 
are shown in figures 3 and 4. The large fluctuations in 
some of the traces are caused by substantial variations in 
the level of the noise sample. Since some of the curves 
appeared to be closely gaussian, the methods used to 
measure the parameters of the distribution included over- 
lays, calculated moments, and cumulative probability 
graphs. Tests of significance and the chi-square "goodness
of fit" tests were used to determine what values of
skewness and kurtosis were improbable at a 5 or 1 per cent
probability level.


Figure 3. Examples of some PD curves taken in shallow water,
compared with a normal curve.


Figure 4. Examples of some PD curves taken in both shallow
and deep water, compared with a normal curve. (Panel shown:
San Diego Harbor, ship noise. Axes: probability density versus
amplitude in standard deviation units.)


Data Reduction Techniques


OVERLAY METHOD 


Since it was expected that the probability density 
curves obtained with the PDA would have a gaussian or 
nearly gaussian distribution, an overlay with a gaussian 
curve was used. The curve had parameters of a mean 
equal to zero and a standard deviation equal to one. Figure 
5 illustrates the use of this method with two curves, one 
judged to be gaussian and the other non-gaussian. Some 
probability density curves obtained with the PDA were 
judged to be very nearly gaussian. 


One disadvantage of the overlay method is that de- 
cisions about how well a particular curve compares with 
the overlay are purely subjective. Skewness and kurtosis 
can be detected, but the magnitudes of these moments cannot 
be estimated with accuracy. An extension of the overlay
method which will allow estimates of skewness and kurtosis
is described below.


The extension is an overlay with several curves in-
stead of just one. Each curve has a different set of values
for skewness and kurtosis. The curves are positioned
over the actual probability density curve and the parameters
are estimated by interpolation between the two closest
curves. The curves of the overlay can be computed with
the use of Edgeworth's series approximation for nearly
gaussian distributions.* The first four terms of this series
are

$$ f(x) = h(x) - \frac{g_1}{3!} h^{(3)}(x) + \frac{g_2}{4!} h^{(4)}(x) + \frac{10 g_1^2}{6!} h^{(6)}(x) $$

where h(x) is the normalized gaussian distribution, h^{(n)}(x)
is the nth derivative of h(x), g_1 is the standardized skew-
ness, and g_2 is the standardized kurtosis.
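As an illustration of how such overlay curves might be computed today, the sketch below evaluates the first four terms of Edgeworth's series in Python. It assumes the standard form of the series, f(x) = h(x) - (g_1/3!)h'''(x) + (g_2/4!)h''''(x) + (10 g_1²/6!)h^(6)(x), and uses the identity h^(n)(x) = (-1)^n He_n(x) h(x), where He_n is the Hermite polynomial, so that no numerical differentiation is needed. The function names and the example values of g_2 are hypothetical.

```python
import math

def normal_pdf(x):
    """Normalized gaussian h(x): mean zero, standard deviation one."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def edgeworth_pdf(x, g1, g2):
    """First four terms of Edgeworth's series for skewness g1 and kurtosis g2."""
    he3 = x**3 - 3.0 * x                           # Hermite polynomial He_3
    he4 = x**4 - 6.0 * x**2 + 3.0                  # He_4
    he6 = x**6 - 15.0 * x**4 + 45.0 * x**2 - 15.0  # He_6
    # The odd derivative changes sign, so -(g1/3!) h''' becomes +(g1/6) He_3 h.
    correction = (1.0 + g1 / 6.0 * he3 + g2 / 24.0 * he4
                  + 10.0 * g1**2 / 720.0 * he6)
    return normal_pdf(x) * correction

# Two hypothetical overlay curves over -3 to +3 standard deviations.
xs = [-3.0 + 0.1 * i for i in range(61)]
positive_kurtosis_curve = [edgeworth_pdf(x, 0.0, 0.5) for x in xs]
negative_kurtosis_curve = [edgeworth_pdf(x, 0.0, -0.5) for x in xs]
```

With g_1 = g_2 = 0 the function reduces to the normal curve, which gives a quick check on any overlay drawn this way.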
were determined to be gaussian or non-gaussian,
using a normal curve as an overlay.


Estimates of the skewness and kurtosis can be found
with the above method; but it does not give any indication of
whether these estimates are significantly different from the
expected values, if the sample is taken from a gaussian
distribution. Using the previous overlay, a method can be
developed so that a sample can be accepted or rejected at
any desired level of probability. Basically the method is
to have two of the curves on the overlay plotted so that they
will represent the maximum deviations allowed in the par-
ticular parameter of a sample with N points. The method
will be developed for kurtosis, but a similar method can be
used for skewness.


The variance of kurtosis is given by*

$$ \mathrm{var}(g_2) = 24/N \qquad (4) $$

for large N. This holds for a sample taken from a normal
parent population. The standard deviation of kurtosis is
(24/N)^{1/2}; if the kurtosis is distributed normally, then from
the ratio of a particular value of kurtosis (g_2') and the
standard deviation we can obtain the probability of getting a
value of kurtosis as large or larger than g_2'. The ratio is

$$ \beta = \frac{g_2'}{(24/N)^{1/2}} $$

The probability of getting a value of kurtosis as large as or
larger than g_2' is given by the amount of area under a
normal curve outside the -β and +β standard deviations.
A value of β = 1.96 corresponds to a probability level of
5 per cent, or 1/20th the total area. A ratio as large as
1.96 may be considered sufficiently improbable and hence
g_2' can be assumed to result from a non-gaussian distribution.
The sample would therefore be rejected as coming from a
gaussian distribution. The value of g_2' therefore depends
on N: g_2' = ±1.96(24/N)^{1/2}. Edgeworth's series would then
be used to compute two curves, one with -g_2' (for negative
kurtosis) and one with +g_2' (for positive kurtosis). These
curves would represent the limits, at a 5 per cent
probability level, within which a sample of N points would be
considered as coming from a gaussian distribution.


Figure 6 shows two curves as they would appear in
the overlay. These two curves are the limits for a sample
of bandwidth about 55 c/s, with N given by the equation N =
6.7W, where W is the bandwidth. The equation is obtained
from Appendix B, using a time constant τ = 2.3 seconds.


Figure 6. Overlay indicating g_2' of +0.50 and of -0.50.
A curve having a value of kurtosis as large or larger
than these values will be non-gaussian at a 5 per cent
level for a sample of 370 points or, equivalently, a
bandwidth of about 55 c/s.
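The limiting kurtosis for the overlay can be checked numerically. The short Python sketch below evaluates 1.96 times the large-sample standard deviation of kurtosis, (24/N)^{1/2}, for the 370-point case cited for figure 6; the function name `kurtosis_limit` is hypothetical.

```python
import math

def kurtosis_limit(n_points, beta=1.96):
    """Largest |g2| accepted as gaussian: beta standard deviations of kurtosis."""
    return beta * math.sqrt(24.0 / n_points)

# For the 370-point sample this is close to the 0.50 drawn on the overlay.
limit_370 = kurtosis_limit(370)
print(round(limit_370, 3))
```

Because the limit varies as the inverse square root of N, quadrupling the number of points only halves the kurtosis that can be detected, which is one reason a separate overlay would be needed for each bandwidth.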


The overlay method was not used extensively because
of the complexity that comes from considering different
values of N and also different combinations of skewness and
kurtosis in the same sample. A method using computed
moments of the curves is described next; it was felt that
this method would yield accurate values of the mean,
standard deviation, skewness, and kurtosis.


METHOD OF MOMENTS 


The method of moments is basically a general method
of forming estimates of the parameters of a distribution by
means of a set of measured sample values. The first few
moments of the actual distribution are calculated and these
are used as estimates of the moments of the parent population.
On the basis of these moments a suitable theoretical dis-
tribution curve is selected. For any particular distribution
curve the moments are functions of the parameters of that
curve. The parameters are determined and tests of sig-
nificance are made on the skewness and kurtosis.


The moments about the origin are defined as*

$$ m_r' = \sum_i p_i(x) \, x_i^r \qquad (6) $$

where p_i(x) is the probability that a value selected at ran-
dom from the population will lie in the ith class. The
variate x with which we are concerned may be discrete or
continuous.


The moment

$$ m_1' = \sum_i p_i(x) \, x_i \qquad (7) $$

is defined as the mean value of x, m_1' = x̄.


Another more important set of moments is obtained
by changing the origin to the arithmetic mean. Equation 8
defines the moments about the mean.

$$ m_r = \sum_i p_i(x) \, (x_i - \bar{x})^r \qquad (8) $$


For computing purposes, the relations between the
m_r and the m_r' are convenient. Expressing the m_r in terms
of the m_r' we have the relations

$$ \bar{x} = m_1' \qquad (9a) $$
$$ m_2 = m_2' - (m_1')^2 \qquad (9b) $$
$$ m_3 = m_3' - 3 m_2' m_1' + 2 (m_1')^3 \qquad (9c) $$
$$ m_4 = m_4' - 4 m_3' m_1' + 6 m_2' (m_1')^2 - 3 (m_1')^4 \qquad (9d) $$


Grouping errors are negligible, so Sheppard's 
corrections are not applied. 


These moments can be expressed in standard units
by the use of a standardized variable z, by dividing the
deviation of the variable x from its mean by s_x, the standard deviation.

$$ z = \frac{x - \bar{x}}{s_x} \qquad (10) $$

The standardized moments are defined by the equations

$$ \alpha_r = \sum_i p_i(x) \, z_i^r = \frac{m_r}{s_x^r} \qquad (11) $$

The first four standardized moments are

$$ \alpha_1 = 0 \qquad (12a) $$
$$ \alpha_2 = 1 \qquad (12b) $$
$$ \alpha_3 = \frac{m_3}{s_x^3} \qquad (12c) $$
$$ \alpha_4 = \frac{m_4}{s_x^4} \qquad (12d) $$


The third moment, α_3, is a measure of the skewness of the
distribution. A positive value indicates a distribution with
a longer positive tail than a negative tail.


The fourth standardized moment, α_4, is a measure
of the kurtosis of the distribution. In some cases it is a
measure of the "peakedness" of the distribution, though it
is now understood that the length and size of the tails are
very important in this measurement.


For a normal curve the values of α_3 and α_4 will be
0 and 3, respectively. We redefine the skewness and kur-
tosis as

$$ g_1 = \alpha_3 \qquad (13a) $$

$$ g_2 = \alpha_4 - 3 \qquad (13b) $$

so that g_2 is 0 for a normal curve.
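The chain from moments about the origin to the redefined skewness and kurtosis can be sketched in Python as follows. This is an illustrative sketch, not the computation procedure actually used for the report: the function name `moments_from_density`, the equal-width class assumption, and the gaussian test curve are all hypothetical.

```python
import math

def moments_from_density(centers, density):
    """Mean, standard deviation, g1 and g2 from densities read at class midpoints."""
    total = sum(density)
    p = [d / total for d in density]     # class probabilities (equal class widths assumed)
    # Moments about the origin, m_r' for r = 1..4.
    m1p, m2p, m3p, m4p = (sum(pi * x**r for pi, x in zip(p, centers))
                          for r in range(1, 5))
    # Moments about the mean, from the relations between m_r and m_r'.
    m2 = m2p - m1p**2
    m3 = m3p - 3.0 * m2p * m1p + 2.0 * m1p**3
    m4 = m4p - 4.0 * m3p * m1p + 6.0 * m2p * m1p**2 - 3.0 * m1p**4
    s = math.sqrt(m2)
    g1 = m3 / s**3          # standardized skewness
    g2 = m4 / s**4 - 3.0    # standardized kurtosis, zero for a normal curve
    return m1p, s, g1, g2

# Hypothetical check on a gaussian curve tabulated from -6 to +6.
centers = [-5.95 + 0.1 * i for i in range(120)]
density = [math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in centers]
mean, sd, g1, g2 = moments_from_density(centers, density)
```

The check uses a wide amplitude range on purpose; restricting the same curve to a narrower range lowers the computed kurtosis.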


It is not very likely that the third and fourth moments
of a random sample will be zero. Depending on the distri-
bution and on the actual sample values, the third and fourth
moments will have some value different from zero. To de-
termine whether this difference is significant, it is neces-
sary to use the variances of the third and fourth moments.*

$$ \mathrm{var}(g_1) = 6N(N-1)(N-2)^{-1}(N+1)^{-1}(N+3)^{-1} \qquad (14a) $$

$$ \mathrm{var}(g_2) = 24N(N-1)^2(N-3)^{-1}(N-2)^{-1}(N+3)^{-1}(N+5)^{-1} \qquad (14b) $$

For large N use

$$ \mathrm{var}(g_1) = 6/N \qquad (15a) $$
$$ \mathrm{var}(g_2) = 24/N \qquad (15b) $$
The hypothesis to be tested is that the data sample is
taken from a gaussian distribution. To test the hypothesis
compare g_1 to (6/N)^{1/2} and g_2 to (24/N)^{1/2} (see ref. 5); then

if |g_1|/(6/N)^{1/2} > 1.96, reject the hypothesis at the 5 per cent level;
if |g_1|/(6/N)^{1/2} > 2.57, reject the hypothesis at the 1 per cent level.


Similarly, for g_2,

if |g_2|/(24/N)^{1/2} > 1.96, reject the hypothesis at the 5 per cent level;
if |g_2|/(24/N)^{1/2} > 2.57, reject the hypothesis at the 1 per cent level.
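A minimal Python sketch of this significance test is given below. It assumes the usual small-sample variance formulas for skewness and kurtosis, 6N(N-1)/[(N-2)(N+1)(N+3)] and 24N(N-1)²/[(N-3)(N-2)(N+3)(N+5)], together with the large-sample forms 6/N and 24/N; the function names are hypothetical.

```python
import math

def var_g1(n):
    """Variance of skewness for a sample of n points from a normal population."""
    return 6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))

def var_g2(n):
    """Variance of kurtosis for a sample of n points from a normal population."""
    return 24.0 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5))

def significance(g, variance):
    """Return '1%' or '5%' when |g| is significant at that level, else None."""
    ratio = abs(g) / math.sqrt(variance)
    if ratio > 2.57:
        return "1%"
    if ratio > 1.96:
        return "5%"
    return None

# Example with 370 points, using the large-sample variance 24/N for kurtosis.
n = 370
print(significance(0.6, 24.0 / n), significance(0.2, 24.0 / n))
```

For a sample of a few hundred points the exact and large-sample variances differ by only about one per cent, so the simpler forms are adequate here.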


CHI-SQUARE "GOODNESS OF FIT" TEST


The χ² test will be applied to the hypothesis that a
sample of individuals forms a random sample from a
population with a given probability distribution. The param-
eters of a distribution are known and are not estimated
from the sample itself. Later a modification will be given
for the situation where the parameters are estimated from
the sample.


The quantity*

$$ \chi_s^2 = \sum_i \frac{(N_i - N p_i)^2}{N p_i} \qquad (16) $$

is a measure of the deviation of the sample from the ex-
pectation, where N_i is the number of observed frequencies
in the ith interval, and Np_i is the number of expected fre-
quencies in the ith interval as predicted by the theoretical
distribution. Karl Pearson proved that the above quantity,
in the limit, is the ordinary χ² distribution which is now
tabulated in most statistics books.


* See ref. 5, pp. 197-200.


The χ²_s computed with equation 16 is compared with
the 5 per cent point for (k-1) degrees of freedom from a χ²
distribution table. The tabulated value of χ² at the 5 per
cent probability level with 29 degrees of freedom is 42.6.
Now, if χ²_s, as calculated by equation 16, is greater than
42.6, then the hypothesis is rejected by this test; that is,
the sample is non-gaussian.


The application of the χ² test to the data was as fol-
lows (fig. 7). Let f(x) represent a probability density
curve obtained from the PDA. Divide the curve into 30
intervals from x = -3.0 to x = +3.0. Let ξ_i be the mid-
point of Δx_i, one of the intervals, and f(ξ_i) the value of
the probability density at ξ_i. The area under the curve is
then estimated by A_i', where A_i' = f(ξ_i)Δx. Let

$$ A_i = \int_{z_{i-1}}^{z_i} \varphi(z) \, dz $$

where φ(z) is the theoretical probability
density distribution with which the experimental curve is
being compared. Also let n_i = NA_i', and e_i = NA_i.

$$ \chi_s^2 = \sum_i \frac{(n_i - e_i)^2}{e_i} \qquad (17) $$

N is estimated according to the method described in
Appendix B.


Figure 7. Curve obtained with the PDA (p = f(x)), and mea-
sured probability density at ξ_i (f(ξ_i)). The area within
the rectangle (A_i') is an estimate of the area under the
curve for the interval Δx = x_i - x_{i-1}. (Labels: f(ξ_i),
measured PD at ξ_i; p = f(x), curve obtained with PDA.)


The only modification needed when the parameters
are estimated from the sample itself is a reduction in the
number of degrees of freedom by one for each parameter
that is estimated from the sample. For example, if a
gaussian distribution is assumed and the mean and standard
deviation are estimated from the sample, then the number
of degrees of freedom are (k-1)-2, where k is the number
of groups.
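The chi-square computation described above can be sketched in Python as follows. The sketch assumes a unit normal as the theoretical distribution, observed frequencies taken as N times the rectangle area f(ξ_i)Δx, and expected frequencies taken as N times the normal area over the same interval; the function names and the two test curves are hypothetical.

```python
import math

def normal_cdf(z):
    """Area under the unit normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_square_statistic(centers, density, dx, n_points):
    """Sum of (n_i - e_i)^2 / e_i over intervals of width dx centered on `centers`."""
    chi2 = 0.0
    for xi, f in zip(centers, density):
        observed = n_points * f * dx   # n_i: N times the rectangle area
        expected = n_points * (normal_cdf(xi + dx / 2.0)
                               - normal_cdf(xi - dx / 2.0))   # e_i: N times normal area
        chi2 += (observed - expected) ** 2 / expected
    return chi2

# 30 intervals from -3.0 to +3.0, giving 29 degrees of freedom.
dx = 0.2
centers = [-2.9 + dx * i for i in range(30)]
gaussian_curve = [math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in centers]
flat_curve = [1.0 / 6.0] * 30   # deliberately non-gaussian comparison curve

chi2_gaussian = chi_square_statistic(centers, gaussian_curve, dx, 370)
chi2_flat = chi_square_statistic(centers, flat_curve, dx, 370)
```

The statistic for the gaussian curve falls far below the tabulated 42.6, while the flat curve exceeds it by a wide margin because of the excess area assigned to the tails.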


CUMULATIVE PROBABILITY PLOTS 


The use of cumulative probability paper was also in-
vestigated. On this type of plot a gaussian distribution is
represented by a straight line. The data are normalized
and plotted, and deviations from a gaussian distribution are
seen as departures from a straight line. The data are the
p_i(ξ_i) used for the method of moments, where ξ_i is the mid-
point of Δx_i = x_i - x_{i-1}. The points plotted are the normalized
cumulative sums, i.e., the first point is

$$ \frac{p_1(\xi_1)}{\sum_i p_i(\xi_i)} $$

the second point is

$$ \frac{p_1(\xi_1) + p_2(\xi_2)}{\sum_i p_i(\xi_i)} $$

etc. The data used are from x = -3.00 to x = +3.00, so
that a gaussian curve resembles figure 8. The curve de-
viates from a straight line at the ends because of the small
amount of area (0.27 per cent) outside three standard de-
viations. This curve should be used for comparison with
the data instead of the straight line.
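The normalized cumulative sums can be formed with a few lines of Python. The sketch below is hypothetical in its names and in its use of 60 midpoints spaced 0.1 standard deviation apart between -3.00 and +3.00; it applies the sums to a gaussian curve to show the end effect mentioned above.

```python
import math

def normalized_cumulative_sums(p):
    """Running sums of p divided by the total, one value per class."""
    total = sum(p)
    running = 0.0
    points = []
    for value in p:
        running += value
        points.append(running / total)
    return points

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Midpoints of 60 classes covering -3.00 to +3.00 standard deviations.
centers = [-2.95 + 0.1 * i for i in range(60)]
cumulative = normalized_cumulative_sums([normal_pdf(x) for x in centers])
```

Because the sums are normalized over the truncated range, the first point lies below the untruncated normal value at the same amplitude and the last point is forced to 100 per cent, which bends the ends of the plotted line.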


Figure 8. Normalized cumulative sums of
tabulated values of the area of a normal
curve for values of x between -3.00 and
+3.00 standard deviations.


The type of deviations that would result for curves
with skewness and kurtosis are shown in figures 9 and 10.
The distribution curves for certain amounts of skewness
and kurtosis are computed using Edgeworth's first four
terms, equation 4. The f(x) were computed using a mean
of zero and a standard deviation of one. For skewness,
g_1 = ±0.24, g_2 = 0 were used. For kurtosis, g_1 = 0,
g_2 = ±0.48 were used.


Figure 9. Curves illustrating positive and negative
skewness, computed using Edgeworth's series.


Figure 10. Curves illustrating positive and negative
kurtosis, computed using Edgeworth's series.


Several cumulative probability examples are shown
in figures 11 and 12.


Figure 12. Cumulative probability of
noise samples shown in figure 4.


RESULTS


An overlay of a normal curve was constructed and used
on the probability density curves obtained with the XY
recorder. This method was found to have a limited useful-
ness since it would not yield results with the desired amount
of accuracy in all cases. Cases where it could be used are
for curves which do not deviate very much from the overlay.
As the deviations become greater it becomes more difficult
to estimate the parameters accurately. It was decided not
to use an overlay with more than one curve (as described in
"Data Reduction Techniques") because, even though the
curves were normalized to a standard deviation of 1, the
error in setting the input to a value of 1 volt rms can be as
much as 10 per cent, although in most cases it was less
than 5 per cent. Such error makes it difficult to estimate
the skewness and especially the kurtosis, since the overlay
will not fit if the standard deviation is other than 1.


With the method of moments it is possible to get 
estimates of the mean, the standard deviation, the skewness, 
and the kurtosis. The limited range of amplitudes analyzed 
(which was thought to be sufficient before the data reduction) 
causes an error in the calculated moments for those samples 
which have amplitudes extending beyond three standard de- 
viations. The error is more evident in the higher (3rd and 
4th) moments, because of the higher powers of x used in the 
calculations of these moments. A correction can be applied 
to the computed skewness and kurtosis. The correction
takes into account the "ignored" amplitudes, i.e., it com-
pares the computed moment with the moment of a truncated
normal distribution. Actually only the moment of kurtosis
was corrected, since skewness was not found very fre-
quently among the noise samples.
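The size of the truncation effect can be illustrated with a short Python sketch. It computes the kurtosis of a unit normal curve cut off at ±a standard deviations from closed-form truncated moments; this is an illustration only and is not claimed to be the correction procedure applied to the data. The function name is hypothetical.

```python
import math

def truncated_normal_g2(a):
    """Kurtosis g2 of a unit normal curve truncated to the range -a..+a."""
    phi_a = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # density at the cutoff
    area = math.erf(a / math.sqrt(2.0))                        # area inside the range
    m2 = 1.0 - 2.0 * a * phi_a / area                          # second moment after truncation
    m4 = 3.0 - (2.0 * a**3 + 6.0 * a) * phi_a / area           # fourth moment after truncation
    return m4 / m2**2 - 3.0

for cutoff in (3.0, 4.0):
    print(cutoff, round(truncated_normal_g2(cutoff), 3))
```

Under these assumptions a perfectly gaussian sample analyzed only to ±3 standard deviations shows a kurtosis of roughly -0.17, whereas extending the range to ±4 standard deviations reduces the bias to roughly -0.01.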


The hypothesis to be tested is that the noise samples
are taken from noise with a gaussian distribution of ampli-
tudes. The test used is the same one described on page 21
and it is applied to both the skewness and the kurtosis. The
hypothesis is then rejected by the test if both moments of the
sample are significant at the 5 per cent level, and also if
either moment is significant at the 1 per cent level.
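The combined rejection rule can be written compactly in Python. In this sketch the inputs are the ratios of skewness and kurtosis to their respective standard deviations; the function name `reject_gaussian` is hypothetical, and the thresholds 1.96 and 2.57 are the 5 and 1 per cent values quoted earlier.

```python
def reject_gaussian(ratio_g1, ratio_g2):
    """Apply the rule: reject if both are significant at 5%, or either at 1%."""
    a1, a2 = abs(ratio_g1), abs(ratio_g2)
    either_at_1_per_cent = a1 > 2.57 or a2 > 2.57
    both_at_5_per_cent = a1 > 1.96 and a2 > 1.96
    return either_at_1_per_cent or both_at_5_per_cent

# A single moment significant only at the 5 per cent level is not enough.
print(reject_gaussian(2.1, 0.4), reject_gaussian(2.1, 2.0), reject_gaussian(0.3, 2.8))
```

Requiring both moments at the 5 per cent level, or one at the 1 per cent level, makes the rule stricter than a single 5 per cent test and so reduces the number of gaussian samples rejected by chance.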


Table 2 lists the sample for which the 3rd and/or 4th 
moments were found to be significantly different from the 


TABLE 2. NUMBER OF CURVES SHOWING SIGNIFICANT VALUES 
OF SKEWNESS AND KURTOSIS. 


g_1 (SKEWNESS) | g_2 (KURTOSIS)


SAMPLE 


13* aan ee BIOLOGICAL NOISE 
(eS | ae x CLEAN NOISE 

20 || | |X | UOC ICAISNOISE 
2ST | aaa | ee | XT | | OWE EREOREIUNIEFANKS 
B x CLEAN NOISE 

277 ao X X SHIP NOISE 

287 X xX CLEAN NOISE 

29% (ios Sapeetl eemiete lias Gan he cara | MOSTLY CLEAN NOISE 
42 apes SHIP NOISE 

i a ICE NOISE 

SL es a CINE 

52m ee eX a CEANOISE 

607 X X 60 C/S 

aa ee cs 

69 eat ee es | 

84* pa ee a ee 

100 isc eal aa 


ALL THE FOLLOWING 
ARE SHIP NOISE 
SAMPLES 



*Hypothesis rejected for that sample. 
expected values. An asterisk by the sample number indicates
that the hypothesis was rejected for that particular sample
according to the conditions of the previous paragraph.
Comments on the type of noise (clean, biological, ship noise,
etc.) in the sample are also given. The number of noise
samples that were found to be significant are listed by lo-
cation in table 3. Note that the shallow-water locations
have many more non-gaussian samples than the deep-water
North Pacific location.


TABLE 3. LOCATIONS OF CURVES SHOWING SIGNIFICANT VALUES OF 
SKEWNESS AND KURTOSIS AT 1 AND 5 PER CENT PROBABILITY LEVELS. 


NO. OF 


LOCATION SAMPLES 


SHALLOW, 
SO. CALIF. 


SHALLOW, 
ALASKA 


NORTH 
PACIFIC 


SAN DIEGO 
(SHIP NOISE 
IN HARBOR) 


If the hypothesis that the noise samples are taken 
from a gaussian distribution is true, then it is expected 
that one sample in 20 may have significant parameters at 
the 5 per cent level. If it is not true, it is expected that 
more than one sample in 20 may have significant parameters. 


Table 3 shows that the values of skewness for the
first three locations are well within the expected number,
except perhaps for the Southern California location, where
two samples out of 29 were found with significant amounts
of skewness. However, since this location had considerable
biological and other non-gaussian types of noise, a greater
number of rejected samples is to be expected. The table
also shows that for kurtosis the expected number (one out 
of 20) was exceeded for both of the shallow-water locations 
but was not exceeded for the deep-water (North Pacific) 
location. For the ship-noise samples, the number of ex- 
pected significant values of skewness and kurtosis are ex- 
ceeded at all levels. Of interest are the number of sig- 
nificant values of skewness in the ship-noise data, since 
there were not many of these for the other three groups. 
For all groups the data indicate that kurtosis is a sensitive 
indicator for the presence of ship noise, biological noise, 
ice noise, and in general any type of noise which has a non- 
gaussian distribution. 
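The "one sample in 20" expectation can be made more concrete with a binomial calculation. The Python sketch below assumes, as a simplification not stated in the report, that the curves are independent and that each has a 5 per cent chance of appearing significant when the noise is truly gaussian; the function name is hypothetical.

```python
import math

def prob_at_least(k, n, p):
    """Binomial probability of k or more significant results among n curves."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
               for j in range(k, n + 1))

# Chance of two or more significant skewness values among 29 gaussian curves.
p_two_or_more = prob_at_least(2, 29, 0.05)
print(round(p_two_or_more, 2))
```

Under these assumptions two significant values out of 29 would occur by chance a little over 40 per cent of the time, so that count alone is weak evidence of non-gaussian noise.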


Table 4 indicates the bandwidth of the noise samples 
which had significant values of skewness and kurtosis at the 
1 per cent level. For the most part the bandwidths involved 
are the larger ones. 


TABLE 4. NOISE SAMPLES FROM FOUR LOCATIONS, SHOWING SIGNIFICANT 
MOMENTS OF SKEWNESS AND KURTOSIS AT 1 PER CENT PROBABILITY LEVEL. 


SOUTHERN CALIFORNIA BERING STRAITS 


BANDWIDTH BANDWIDTH | g, (1%) | g, (1%) 


BROADBAND ] 
20-1500 20-1200 2 
a 20-600 0 2 
NORTH PACIFIC ) 
SAN DIEGO (SHIP 
BANDWIDTH g, (1%) NOISE IN HARBOR) 


1 OCTAVE BANDWIDTH 


1600 c.f. 


20-1200 
1/3 OCTAVE 
1600 c.f. 


600-1200 


An Edgeworth series was fitted to an experimental
curve using the parameters calculated by this method, to
insure that they were good estimates to the extent that they
provided a good fit to the data. Figure 13 is the trace of
an experimental curve of random noise which shows con-
siderable skewness. The fit of an Edgeworth series ap-
proximation to the experimental curve is seen to be very
good.


Figure 13. Experimental PD curve of random noise showing
considerable skewness. The points shown are calculated
using Edgeworth's series with the moments calculated from
the curve itself by the method of moments described in the
text. A theoretical normal curve is shown for comparison.
(Axes: probability density f(x) versus amplitude in standard
deviation units.)
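
The kind of fit shown in figure 13 can be sketched with the usual leading terms of the Edgeworth series for a standardized variable. This is a minimal sketch, not the report's own computation: it assumes g1 is the skewness and g2 the excess kurtosis (zero for a gaussian), and the values of g1 and g2 used are illustrative, not those of the curve in figure 13.

```python
import math

def normal_pdf(x):
    """Unit gaussian probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def edgeworth_pdf(x, g1, g2):
    """Edgeworth approximation for a standardized variable (zero mean,
    unit standard deviation) with skewness g1 and excess kurtosis g2."""
    he3 = x**3 - 3.0 * x                                   # Hermite polynomial He3
    he4 = x**4 - 6.0 * x**2 + 3.0                          # He4
    he6 = x**6 - 15.0 * x**4 + 45.0 * x**2 - 15.0          # He6
    correction = 1.0 + g1 / 6.0 * he3 + g2 / 24.0 * he4 + g1**2 / 72.0 * he6
    return normal_pdf(x) * correction

# Illustrative moments only (assumed values).
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, round(edgeworth_pdf(x, g1=0.5, g2=0.2), 4), round(normal_pdf(x), 4))
```

With g1 = g2 = 0 the expression reduces to the theoretical normal curve, which gives a quick check of the implementation.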


The chi-square test was performed on a few selected
curves and the results are shown in table 5. An X marks
those noise samples for which $\chi^2$ was equal to or greater
than 49.6, which is the tabulated value of $\chi^2$ for 29 degrees
of freedom at a 1 per cent probability level. The value of
$\chi^2$ is given for the samples for which the hypothesis was
not rejected by the test. The table also shows whether


TABLE 5. CURVES FOR WHICH $\chi^2$ WAS COMPUTED, COMPARED
WITH RESULTS BY METHOD OF MOMENTS. THE X's INDICATE
THOSE CURVES WHICH REJECTED THE HYPOTHESIS.

[Table body not legible in the source. Legible sample numbers:
14, 37, 42, 147, 150; an X(5%) entry is legible for nos. 42,
147, and 150.]


the 3rd and 4th moments were significant at the 5 per cent
or 1 per cent level. Generally the results of the chi-square
test agree with the results of the method of moments.
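
The chi-square comparison against the 49.6 criterion can be sketched as follows. This is only an illustration under stated assumptions: the report does not give the cell layout, so the sketch assumes 30 cells (28 equal-width cells between -3 and +3 standard deviations plus two tail cells), which is one way to arrive at 29 degrees of freedom. The synthetic samples and all names are hypothetical.

```python
import random
from statistics import NormalDist

CHI2_CRIT_29DF_1PCT = 49.6   # tabulated value quoted in the text

def cell_edges(n_cells=30, span=3.0):
    """Edges separating n_cells cells: two open tails plus equal-width
    cells covering -span..+span (assumed layout)."""
    inner = n_cells - 2
    width = 2.0 * span / inner
    return [-span + i * width for i in range(inner + 1)]

def expected_probs(edges, dist=NormalDist(0.0, 1.0)):
    """Cell probabilities under the hypothesis (zero mean, unit std. dev.)."""
    cdf = [dist.cdf(e) for e in edges]
    return [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])] + [1.0 - cdf[-1]]

def observed_counts(samples, edges):
    """Number of samples falling in each cell."""
    counts = [0] * (len(edges) + 1)
    for x in samples:
        counts[sum(1 for e in edges if x >= e)] += 1
    return counts

def chi_square(counts, probs):
    """Pearson chi-square statistic for observed counts against cell probabilities."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))

rng = random.Random(0)
edges = cell_edges()
probs = expected_probs(edges)
gauss = [rng.gauss(0.0, 1.0) for _ in range(5000)]     # synthetic gaussian noise
flat = [rng.uniform(-3.0, 3.0) for _ in range(5000)]   # synthetic non-gaussian noise
for name, data in (("gaussian", gauss), ("uniform", flat)):
    stat = chi_square(observed_counts(data, edges), probs)
    print(name, round(stat, 1), "rejected" if stat >= CHI2_CRIT_29DF_1PCT else "not rejected")
```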


Only one hypothesis was tested and this was that the
noise samples were taken from a gaussian distribution
with the mean equal to zero and the standard deviation
equal to one. A better hypothesis is to assume that the
distribution is gaussian with a mean $\bar{x} = m$ and standard
deviation $s = \sigma$, where $m$ and $\sigma$ are estimated from the
sample itself.


Table 6 shows parameters for several noise samples
which were selected by the overlay normal curve as very
closely approximating a gaussian curve. The four computed
moments of each sample are given (the mean, the standard
deviation, the skewness, and the kurtosis). The ratios of
the skewness and kurtosis to the square roots of their
variances are given. The computed value of $\chi^2$ is given
for comparison. All three methods agree that these samples
TABLE 6. NOISE SAMPLES CHOSEN, BY VARIOUS METHODS, AS BEING VERY CLOSELY GAUSSIAN.

Columns: SAMPLE NO. | COMPUTED MEAN | STD. DEV. | g1 (SKEWNESS) | g2 (KURTOSIS) | g1/(var g1)^1/2 | g2/(var g2)^1/2 | $\chi^2$

[Only fragments of the table body are legible in the source; sample
numbers and column alignment are not recoverable. Legible rows:

0.05984 | 0.99511 | -0.01290 | 0.014
0.06427 | 0.98920 | -0.01077 | 0.030
0.04498 | 0.97240 | -0.01331 | 0.019

Isolated legible values: 0.97868, 0.98493, 0.02767, 0.33, 0.39.]


are very closely gaussian. The largest value of chi-square
occurs for sample no. 112, which has a very good shape;
however it does have a mean which is different from zero,
thus causing the large $\chi^2$. Sample no. 68 also has a large
$\chi^2$ for the same reason.
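
The effect of a small mean offset on chi-square can be illustrated numerically. The sketch below is not the report's calculation: it assumes 30 cells as one possible layout for 29 degrees of freedom, a hypothetical total of 10,000 points, and cell counts that follow a unit-variance gaussian with the stated offset exactly, with no sampling scatter.

```python
from statistics import NormalDist

def cell_probs(edges, mean=0.0, sigma=1.0):
    """Probabilities of the cells bounded by `edges`, with both tails kept as cells."""
    dist = NormalDist(mean, sigma)
    cdf = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    return [b - a for a, b in zip(cdf, cdf[1:])]

def chi_square_from_offset(n_points, mean_offset, edges):
    """Chi-square obtained when the counts follow N(mean_offset, 1) exactly
    but are tested against the zero-mean, unit-deviation hypothesis."""
    p = cell_probs(edges)
    q = cell_probs(edges, mean=mean_offset)
    return n_points * sum((qi - pi) ** 2 / pi for pi, qi in zip(p, q))

edges = [-3.0 + 6.0 * i / 28 for i in range(29)]   # 30 cells (assumed layout)
for offset in (0.0, 0.03, 0.06):
    print(offset, round(chi_square_from_offset(10_000, offset, edges), 1))
```

For small offsets the contribution grows roughly as the number of points times the square of the offset, so a curve with an excellent shape can still give a large chi-square when the hypothesis fixes the mean at zero.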


Cumulative probability graphs of the noise samples 
can reveal if the curve has skewness or kurtosis and can 
also reveal other deviations. However this method was not 
used beyond plotting a few curves. The graphs would per- 
haps require an overlay to estimate skewness and kurtosis 
but, as before, the sample distribution needs to be stan- 
dardized for each sample before the overlay can be applied. 


Analysis with the cumulative probability method was 
not performed, as it was felt that it would not provide much 
more information than the method of moments and the
chi-square test, and the estimates would not be as accurate as
those obtained by the method of moments. This method 
does, however, give a better indication of whether a noise 
sample is gaussian than does the overlay method on the XY 
probability density curves. 
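
Although the cumulative probability method was not carried through, its idea can be sketched briefly. The following assumes PD values tabulated at equally spaced amplitudes; it accumulates the PD into a cumulative probability and converts each value to the corresponding normal quantile, so that a gaussian sample plots as a straight line. The function name and the test curve are hypothetical.

```python
from statistics import NormalDist

def cumulative_probability_points(x, w):
    """Pairs (upper cell edge, normal quantile of cumulative probability)
    from PD values w at equally spaced amplitudes x."""
    dx = x[1] - x[0]
    total = sum(w) * dx                      # normalize the area to one
    std_normal = NormalDist()
    points, running = [], 0.0
    for xi, wi in zip(x, w):
        running += wi * dx / total
        if 0.0 < running < 1.0:              # quantile undefined at 0 and 1
            points.append((xi + 0.5 * dx, std_normal.inv_cdf(running)))
    return points

xs = [-4.0 + 0.1 * i for i in range(81)]
ws = [NormalDist().pdf(v) for v in xs]       # ideal gaussian PD curve
for edge, z in cumulative_probability_points(xs, ws)[20:61:10]:
    print(round(edge, 2), round(z, 3))
```

Skewness shows up as curvature of the plotted points in one direction, and kurtosis as an S-shaped departure from the straight line.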


CONCLUSIONS 


PD of Ambient Ocean Noise 


The results obtained from the analysis of ambient 
ocean-noise samples from the three locations indicate that 
the hypothesis (that is, the assumption that the ocean 
samples are taken from a gaussian noise distribution) is 
not rejected when using "clean" ambient noise. The
hypothesis is rejected for "contaminated" ambient ocean
noise. The contamination may be ship noise, biological 
noise, noise from ice (in polar regions), etc. Thus it has 
been shown that ambient ocean noise, under certain con- 
ditions, can be assumed to have a gaussian distribution 
of amplitudes. 


PD of Ship Noise 


The same hypothesis used for the ambient noise 
samples was used to test the ship samples. The results 
indicate that ship-noise samples are not gaussian, since 
six out of nine samples tested rejected the hypothesis that 
the samples were taken from a gaussian distribution. It 
was not found possible to distinguish between the types of 
contamination of the ambient noise; that is, there were no 
obvious differences in the probability density distributions 
for ambient noise contaminated by ship noise and that con- 
taminated by other sources. 


Comparison of Test Methods 


The method of moments was found to be the most 
suitable, of the four methods of data reduction, for providing 
the most accurate and useful estimates of the parameters. 
However, when using this method the tails of the distribution 
should not be left out of the calculations, since their con- 
tribution at the higher moments is very significant. The 
chi-square test, although it does not provide estimates of
the parameters of the distribution, does provide a good
indication of the "fit" of the sample distribution to the
theoretical distribution. The chi-square method can be used
to give an independent check on the method of moments.
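
The importance of the tails to the higher moments can be illustrated for an ideal gaussian curve. The sketch below is an assumption-laden illustration, not the report's procedure: it integrates a unit gaussian PD curve over a limited amplitude range, renormalizes it, and computes the standard deviation and the kurtosis g2, taken here as the fourth moment ratio minus 3.

```python
import math

def truncated_gaussian_moments(limit, steps=4000):
    """Standard deviation and excess kurtosis g2 of a unit gaussian PD curve
    measured only over -limit..+limit (standard deviation units)."""
    dx = 2.0 * limit / steps
    m0 = m2 = m4 = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * dx              # midpoint of each amplitude cell
        w = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        m0 += w * dx
        m2 += x * x * w * dx
        m4 += x**4 * w * dx
    m2 /= m0                                     # renormalize the truncated curve
    m4 /= m0
    return math.sqrt(m2), m4 / (m2 * m2) - 3.0

for limit in (2.0, 3.0, 4.0, 5.0):
    sd, g2 = truncated_gaussian_moments(limit)
    print(limit, round(sd, 4), round(g2, 4))
```

Under these assumptions a curve cut off at 2 standard deviations understates the kurtosis far more than one carried to 4 standard deviations.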


RECOMMENDATIONS 


1. Use the method of moments described here if 
better accuracy than that given by overlays is desired to 
estimate the moments and to determine whether a sample 
is gaussian or non-gaussian. 


2. In the probability density analysis of a noise sample,
use a range of amplitudes covering at least ±4 standard
deviations; otherwise large errors in the estimates of the
moments will frequently result.


3. In future applications of the PDA, have the output
of the PDA in a digital form rather than a continuous curve,
so that the data will be in a form more suitable for the
calculation of the moments of the distribution.
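
With the PD available in digital form, the first four moments follow directly from the tabulated values. The sketch below is a minimal illustration of that calculation, assuming equally spaced amplitudes, g1 defined as the standardized third moment, and g2 as the standardized fourth moment minus 3. The function name and the test curve are hypothetical.

```python
import math

def moments_from_pd(x, w):
    """Mean, standard deviation, skewness g1 and excess kurtosis g2 from PD
    values w tabulated at equally spaced amplitudes x."""
    dx = x[1] - x[0]
    area = sum(w) * dx
    p = [wi * dx / area for wi in w]                   # normalized cell probabilities
    mean = sum(pi * xi for pi, xi in zip(p, x))
    m2, m3, m4 = (sum(pi * (xi - mean) ** k for pi, xi in zip(p, x)) for k in (2, 3, 4))
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Test curve: gaussian shape with a small mean offset, covering -5 to +5.
xs = [-5.0 + 0.1 * i for i in range(101)]
ws = [math.exp(-0.5 * (v - 0.05) ** 2) for v in xs]
print([round(v, 4) for v in moments_from_pd(xs, ws)])
```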
APPENDIX A:


DESCRIPTION OF THE PDA AND ITS OPERATION 


The PDA was designed primarily for the determination 
of the probability density (PD) of random noise. Approxi- 
mations of the PD curves of other waveforms, such as sine 
waves, square waves, and others can also be obtained. 

The PD of a sine wave and square wave as obtained with the
PDA are shown in figure A1. Also shown are the theoretical
PD's of the two curves.
The PDA has one input and several types of outputs.
The input level to the PDA is controlled by a precision 10-turn
potentiometer and the level can be in the range from 1
volt to 50 volts rms. The frequency range is from dc to
10 kc/s and this range should never be exceeded. The input
level is set with the aid of a true-rms voltmeter.


There are two analog outputs. The X (amplitude)
output is from -10 to +10 volts and corresponds to -5 to +5
times the rms input level. The Y (probability density) output
is from 0 to about 7.5 volts and corresponds to a PD
from 0 to 10. However a meter on the instrument gives a
reading of PD on either of two scales, from 0 to 0.4 or
0 to 1.0. A PD greater than one can be measured using
the digital outputs, of which there are two: a 1-Mc/s and
a 10-Mc/s output. Both digital outputs are on when the input
signal is inside the amplitude interval $\Delta x$, at any
selected amplitude. The PD is obtained by counting the
number of cycles per second from either digital output.
The counting interval can be varied as the occasion demands.
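
The conversion from a digital count to a PD value can be sketched as follows. This is an assumption-based illustration rather than the PDA manual's procedure: it supposes that clock cycles reach the counter only while the signal is inside the amplitude interval, so that the count divided by the number of clock cycles in the counting interval is the fraction of time spent in the interval, and that the interval width is expressed in units of the rms input level. The numerical values are hypothetical.

```python
def pd_from_count(count, gate_seconds, clock_hz, delta_x):
    """Estimate the PD at the selected amplitude from a gated clock count.

    count         cycles counted during the counting interval
    gate_seconds  length of the counting interval, seconds
    clock_hz      clock rate of the digital output used (1e6 or 1e7)
    delta_x       width of the amplitude interval, in rms units (assumed)
    """
    fraction_of_time = count / (clock_hz * gate_seconds)
    return fraction_of_time / delta_x

# Hypothetical reading: 40,000 cycles in 1 second on the 1-Mc/s output
# with an interval width of 0.1 rms units.
print(pd_from_count(40_000, 1.0, 1e6, 0.1))
```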


Another output available on the PDA is the pulse out- 
put. This particular output gives one pulse each time the 
signal goes into the amplitude interval, or two pulses for 
signals exceeding the set-in level (one pulse when signal 
passes through on the way up and another pulse when it is 
on the way down). 


Some of the important controls on the PDA are the input
potentiometer (to determine and set input level), the "X"
10-turn calibrated potentiometer (to select the amplitude
around which the probability density is to be measured), 
and the output-damping-sweep time control (to select the 
averaging time and sweep speed). Other controls deal with 
the calibration of the PDA and are better described in the 
PDA manual.* 


APPENDIX B: 


DETERMINATION OF NUMBER OF DATA POINTS OF 
EACH SAMPLE 


The application of the $\chi^2$ test to our type of data was
desired as a means of testing the "goodness of fit" of the
data to certain theoretical distributions and, in particular,
the fit to the gaussian distribution. There is some difficulty
in applying this test directly because $N$, the total number of
points in our sample, is not known. However it is possible
to obtain an estimate of $N$ which can be used. The estimate
of $N$ is also used in determining whether the moments of
the distribution that are calculated by the method of moments
are significant at a given probability level. The estimate
of $N$ is found as follows.


From reference 1 use

$$\sigma_\epsilon^2 = \frac{(3/8\pi)^{1/2}}{f_c T\, w(x)\, \Delta x} \qquad \text{(B-1)}$$


where $f_c$ is the input bandwidth of the signal, $T$ is the
sampling interval or a time constant which gives an equivalent
averaging, and $w(x)$ is the value of probability density
obtained from the PDA. The $\sigma_\epsilon^2$ is the normalized variance of
the estimate of the probability density, and is not to be
confused with the variance of the distribution of amplitudes
of the input signal. In reference 1, $N$ is given as


$$N = \frac{1}{\sigma_\epsilon^2\, \Delta x} \qquad \text{(B-2)}$$


Here $N$ represents the total number of times that the input
signal enters the amplitude interval $\Delta x$ at the $x$ of interest.
After combining equations B-1 and B-2 we have

$$N = \frac{f_c T\, w(x)}{(3/8\pi)^{1/2}} = 2.9\, f_c T\, w(x) \qquad \text{(B-3)}$$
The total number of points in the entire sample is
what is needed, and $N$ represents only that number in the
interval $\Delta x$. However, we have normalized data and
therefore

$$\sum w(x)\, \Delta x = 1$$

$$N_T = \sum N\, \Delta x = \frac{f_c T}{(3/8\pi)^{1/2}}$$

Calculations performed with this equation agree well with
some data taken using the pulse output of the PDA. Figure
B1, curve B, shows the results of equation B-3 for $T = 1$
second and $w(x) = w(0) = 0.4$, or

$$N = 1.16\, f_c$$
The theoretical curve is compared with actual
measurements taken from the PDA pulse output using a
counter with a 1-second sampling interval. The slight
difference between the two is due to an error in the actual
cutoff frequency of the filter. The effective bandwidth of
the filter is a little larger, about 1.12 times the set value.